
Conversation

maxzuo commented Aug 14, 2025

PR Description

Adds support for Seq2Seq models: AutoModelForSeq2SeqLM.
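
With this change, a Seq2Seq checkpoint should load through the usual entry point. A minimal sketch of the intended usage (the model name and keyword arguments below are illustrative and assume the existing FastModel.from_pretrained signature):

```python
from unsloth import FastModel

# T5-style encoder-decoder checkpoint; before this PR it would be routed to
# the CausalLM / VLM loaders instead of AutoModelForSeq2SeqLM.
model, tokenizer = FastModel.from_pretrained(
    model_name = "google/flan-t5-small",  # illustrative small Seq2Seq model
    max_seq_length = 512,
    load_in_4bit = False,
)
```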

Why

Seq2Seq models are not directly supported, despite Unsloth's advertised support for all model architectures. This is because FastModel.from_pretrained sets the auto_model parameter to either AutoModelForCausalLM or AutoModelForVision2Seq/AutoModelForImageTextToText, neither of which covers encoder-decoder (Seq2Seq) models.

Further, since models like T5 have class names ending in ForConditionalGeneration, Unsloth registers them as VLMs and tries to load them as such.

I use AutoModelForSeq2SeqLM._model_mapping to check whether a model config is registered as a Seq2Seq model (see the sketch below). This logic can be extended to other auto models (e.g., AutoModelForSequenceClassification) if desired.
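
For reference, a minimal sketch of that check (the helper name and the dispatch line are illustrative, not the exact code in this PR):

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoModelForSeq2SeqLM

def is_seq2seq(model_name: str) -> bool:
    # AutoModelForSeq2SeqLM._model_mapping maps config classes to model classes
    # (e.g. T5Config -> T5ForConditionalGeneration), so membership of the config
    # class tells us the checkpoint is a registered Seq2Seq model rather than a
    # VLM whose class name merely ends in "ForConditionalGeneration".
    config = AutoConfig.from_pretrained(model_name)
    return type(config) in AutoModelForSeq2SeqLM._model_mapping

# Illustrative dispatch: prefer AutoModelForSeq2SeqLM, otherwise fall back to
# the existing CausalLM path.
auto_model = AutoModelForSeq2SeqLM if is_seq2seq("google/flan-t5-small") else AutoModelForCausalLM
```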

Links

Support for T5 has some community interest:

- please give t5 support
- Support T5 models
maxzuo marked this pull request as draft August 14, 2025 18:16
Datta0 (Collaborator) commented Aug 18, 2025

Hey @maxzuo, thanks for the contribution!
It would be a great help if you could create a notebook showing fine-tuning of a small Seq2Seq model on Google Colab.
Also, I notice this PR is marked as a draft. Are you intending to add more to it?

maxzuo (Author) commented Aug 18, 2025

@Datta0 Sure, I'm actively working on it; that's actually why I converted this to a draft. Will let you know!

Aman-byte1 commented
@maxzuo did you work on it?
